reinforcement learning

Apr 9, 2026 · 29 min read Intelligence Cartography

Enter the Scaling of RL Environments

The environment is no longer a passive test harness; it is a data engine. This post maps 10 dimensions of scaling, from task generation to multi-agent self-play.

Mar 12, 2026 · 36 min read Intelligence Cartography

From The Witness to ARC-AGI-3: “Teaching” Fluid Intelligence via Puzzles

13 interactive puzzle games with 1,872 levels inspired by The Witness, compatible with ARC-AGI-3 SDK and RL-ready via OpenEnv — an open-source training ground for teaching machines fluid intelligence

Feb 20, 2026 · 45 min read Intelligence Cartography

Re-visiting Mid-training Stage: for & with Agentic RL

Re-examining mid-training as the strategic centerpiece of the LLM pipeline — how it builds the knowledge foundation for agentic RL, and how RL signals are now flowing backward to improve mid-training itself

Feb 13, 2026 · 38 min read Intelligence Cartography

Inside the Agentic RL Training Loop

A Step-by-Step Walkthrough using Slime and SWE-Bench as an Example

Jan 15, 2026 · 4 min read Intelligence Cartography

JustTinker: Minimal RLVR for Building Reasoning Models Under $150

Low-Resource RLVR for Transforming Instruct Models into Reasoning Models
